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Abstract 

Simulating lattice QCD with chiral fermions and indeed using Domain Wall Fermions continues to be challenging project 
however large are concurrent computers. One obvious bottleneck is the slow pace of prototyping using the low level coding 
which prevails in most, if not all, lattice projects. Recently, we came up with a new proposal, namely QCDLAB, a high level 
language interface, which we believe will boost our endeavours to rapidly code lattice prototype applications in lattice QCD using 
MATLAB/OCTAVE language and environment. The first version of the software, QCDLAB 1.0 offers the general framework on 
how to achieve this goal by simulating set of the lattice Schwinger model |http : / /phys . f shn . edu . al/qcdlab . htmlj 
In this talk we introduce QCDLAB 1.1, which extends QCDLAB 1 .0 capabilities for real world lattice computations with Wilson 
and Domain Wall fermions. 
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1 The challenge of Domain Wall Fermions computations 

Lattice chiral symmetry is an important ingredient for light fermion physics. In the early 1990s there were generally two seem- 
ingly different formulations of chiral fermions on the lattice: Domain Wall Fermions | Kaplan 1992 IFurman and Shamir 199511 



and Overlap Fermions [Narayanan and Neuberger 1993 Neuberger 19981. Introduction of Truncated Overlap Fermions [Borigi 2000c) 



a generalisation of Domain Wall Fermions, made possible to establish the equivalence between these formulations [Borigi 2000 



|Borigi 2005a) . More recently, Moebius Fermions IIBrower et al 20041 . a parametric generalisation of Domain Wall Fermions, 
allow for more flexibility in reducing the chiral symmetry breaking effects. 

Either way, the lattice chiral fermion requires the introduction of an extra dimension coupled to the four other dimensions 
of the standard lattice theory. Hence, Domain Wall Fermions, as a 5D problem, poses a computational challenge. Well known 
algorithms knowing to be working for standard fermions are no more useful and one is forced to use CGNE, which converges 
slowly. Therefore, it is important to search for faster algorithms. In this talk, we present QCDLAB 1.1, an algorithmic research 
tool for 4 and 5 dimensional fermions. The tool, a MATLAB/OCTAVE based environment, allows fast prototyping of linear 
algebraic computations and thus accelerates the process of finding the most efficient fermion algorithm. 



2 The philosophy behind QCDLAB 

Lattice QCD, an industrial-range computing project, is in its fourth decade. It has basically two major computing problems: sim- 
ulation of QCD path integral and calculation of quark propagators. Generally, these problems lead to very intensive computations 
and require high-end computing platforms. 

However, we wish to make a clear distinction between lattice QCD prototyping and production codes. This is very important 
in order to develop a compact and easily manageable computing project. While this is obvious in theory, it is less so in lattice 
QCD practising: those who write lattice codes are focused primary on writing production codes. 

The code of a small project is usually small, runs fast, it is easy to access, edit and debug. Can we achieve these features for 
a lattice prototyping code? Or, can we modify the goals of the lattice project in order to get such features? In our opinion, this 
is possible for a prototyping code, a minimal possible code which is able to test gross features of the theory and algorithms at 
shortest possible time and largest acceptable errors on a standard computing platform. This statement needs more explanation: 

a. Although it is hard to give sharp constraints on the number of lines of the prototyping code, we would call "minimal" that 
code which is no more than a few printed pages. 

b. The run time depends on computing platforms and algorithms, and the choice of lattice action and parameters. It looks 
Uke a great number of degrees of freedom here, but in fact there are hardly good choices in order to reduce the run time 
of a prototyping code without giving up certain features of the theory. Again, it is difficult to give run times. However, a 
"short" run time should not exceed a few minutes of wall-clock time. 

c. We consider a computing platform as being "standard" if its cost is not too high for an academic computing project. 

d. We call simulation errors to be the "largest acceptable" if we can distinguish clearly signal form noise and when gross 
features of the theory are not compromised by various approximations or choices. 

e. Approximations should not alter basic features of the theory. The quenched approximation, for example, should not be 
considered as an acceptable approximation when studying QCD with light quarks. 

A prototyping code with these characteristics should signal the rapid advance in the field, in which case, precision lattice 
computations are likely to happen in many places around the world. Writing a minimal prototyping code is a challenge of three 
smarts: smart computers, smart languages and smart algorithms. 

QCDLAB project is based on the philosophy described above. In the following we first describe briefly QCDLAB 1.0 and 
then the 1 . 1 version. 
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3 QCDLAB 1.0 



QCDLAB is a high level language tool a collection of MATLAB functions for the simulation of lattice Schwinger model. It can 
be used as a small laboratory to test and validate algorithms. In particular, QCDLAB 1.0 serves as an illustration of the minimal 

prototyping code concept. 

QCDLAB 1.0 can also be used for newcomers in the field. They can learn and practice lattice projects which are based on 
short codes and run times. This offers a "learning by doing" method, perhaps a quickest route into answers of many unknown 
practical questions concerning lattice QCD simulations. 

The next two sections describe basic algorithms for simulation of lattice QCD and foundations of Krylov subspace methods. 
Then, we present the QCDLAB 1 .0 functions followed by examples of simple computing projects. The last section outlines the 
future plans of the QCDLAB project.for lattice QCD computations. It is based on the MATLAB and OCTAVE language and 
environment. While MATLAB is a product of The MathWorks, OCTAVE is its clone, a free software under the terms of the 
GNU General Public License. 

MATLAB/OCTAVE is a technical computing environment integrating numerical computation and graphics in one place, 
where problems and solutions look very similar and sometimes almost the same as they are written mathematically. Main 
features of MATLAB/OCTAVE are: 

• Vast Build-in mathematical and linear algebra functions. 

• Many functions form Bias, Lapack, Minpack, etc. libraries. 

• State-of-the-art algorithms. 

• Interpreted language. 

• Dynamically loaded modules from other languages like C/C-H-, FORTRAN. 
QCDLAB 1.0 is 

• General functions: 

- Solvers: BiCGgS, BiCGstab, CG, CGNE, FOM, GMRES, Lanczos, SCG, SUMR 

- Data processing: Autocorel, Binning. 

• Specialised functions for the Schwinger model: 

- Simulation: HMC_W, HMC_KS, Force_W, Force_KS 

- Operators: DiracKS, Dirac_r, Dirac_W, cdotS 

- Measurements: wloop 

a collection of functions for the simulation of lattice Schwinger model. It can be used as a small laboratory to test and validate 
algorithms. In particular, QCDLAB 1.0 serves as an illustration of the minimal prototyping code concept. 

QCDLAB 1.0 can also be used for newcomers in the field. They can learn and practice lattice projects which are based on 
short codes and run times. This offers a "learning by doing" method, perhaps a quickest route into answers of many unknown 
practical questions concerning lattice QCD simulations. It contains the following MATLAB/OCTAVE functions: 

Autocorel BiCGgS BiCGstab Binning cdotS 

CG CGNE Dirac-KS Dirac_r Dirac_W 

FOM Force_KS Force.W GMRES HMC_KS 

HMC_W Lanczos SCG SUMR wloop 

The functions can be grouped as in the following: 

• General functions: 

- Solvers: BiCGgS, BiCGstab, CG, CGNE, FOM, GMRES, Lanczos, SCG, SUMR 

- Data processing: Autocorel, Binning. 

• Specialised functions for the Schwinger model: 



3 



- Simulation: HMC_W, HMC.KS, Force_W, Force_KS 

- Operators: Dirac_KS, Dirac_r, Dirac_W, cdotS 

- Measurements: wloop 

In order to get started with QCDLAB 1.0 one can run the following three projects: 

• For Matlab: MPro jectl , MProject2, MProjectS 

• For Octave: OPro jectl, 0Project2, OProjectS 

The user is required to download these m-files to the working directory of MATLAB/OCTAVE and then type the correspond- 
ing project names in the MATLAB/OCTAVE environment. The first project is a simulation project, the second one is a linear 
system and eigenvalue solver project, whereas the third one is a linear system and eigenvalue solver project for chiral fermions. 

For more information on QCDLAB 1 .0 the reader is referred to the complete documentation available at the project web page 
|http : / / phys .fshn.edu.al/ qcdlab . html, 



As stated in the first section, the goal of the QCDLAB project is to create an algorithmic prototyping environment for lattice 
QCD computations. This goal has no ending station, but rather it is a process which will enhance the computing capabilities as 
the time goes on. The first step in this direction is the 1 . 1 version. It offers new functionality for 4D and 5D computations at the 
linear system and eigenvalue solver level. The new functions are: 



There are two ways to implement the Wilson operator: 

• Classical way of matrix-vector multiplication using Mult_Dirac_W, Mult_Dirac_W_H 

• Creation of a sparse matrix using Dirac4 

In the first case one needs to initialise the Wilson operator using the Initial ise_Dirac_W function and to hack inversion 
solvers of QCDLAB 1.0 in the place where the multiplication with A takes place, eg. 

A*x Mult_Dirac_W (x) 

The same comment is for the Domain Wall Fermion matrix-vector functions Mult_DWF, Mult_DWF_W. Note that the 1.0 
version of the cdotS function is now redefined for the 4D fermion theory and is included in the 1.1 version. Other useful func- 
tions are multiplication of a 4D-vectorby 75: mult_gamma5, the chiral projection functions applied to 4D-vectors: PSminus , 
PSplus. 

Gauge field data structure 

Both Dirac4 and Init ialise_Dirac_W need the gauge field, which must be supplied as a set of four matrices: ul, u2, u3, 
u 4 . Each gauge field component is a 9A^ x 2 matrix, where N is the total number of lattice sites. The first column is the real part 
and the second the imaginary part of the particular SU(3) matrix element. Note that, if reshaped, the most inner dimensions of 
gauge field component are 3 x 3 matrices, i.e. 

reshape {ul,3,3,N,2} 

The user should care to organise the gauge field data in this way for the QCDLAB functions to work as required. 
Sparse Wilson matrices 

In case one want to create a sparse Wilson matrix one uses the Dirac4 function: 



4 QCDLAB 1.1 



cdotS 

Mult_Dirac_W 
mult_gamma5 



Dirac4 

Mu 1 t_D i r a c_W_H 
PSminus 



Initialise_Dirac_W 

Mult_DWF 

PSplus 



InversePower 

Mult_DWF_H 

PowerMethod 
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0] ; 

0] 
0] 
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Pl_minus=eye (4) -gammal; 
P2_minus=eye (4) -gamma2; 
P3_minus=eye ( 4 ) -gamma3 ; 
P4_minus=eye ( 4 ) -gamma4 ; 



function A=Dirac4 (ul , u2 , u3, u4 ) ; 
% Constructs Wilson-Dirac operator 
mass=0; Nl=8; N2=8; N3=8; N4=16; N=N1*N2*N3*N4; 

% Gamma matrices 

gammal = [0, 0, 0,-i; 0, 0,-i, 0; 0, 

gamma2 = [0, 0, 0,-1; 0, 0, 1, 0; 0, 
gamma 3 = [0, 0,-i, 0; 0, 0, 0, i; i, 
gamma4 = [0, 0,-1, 0; 0, 0, 0,-1; -1, 

% Projection operators 
Pl_plus = eye ( 4 ) tgammal ; 
P2_plus = eye ( 4 ) +gamma2 ; 
P3_plus = eye ( 4 ) +gamma3 ; 
P4_plus = eye ( 4 ) +gamma4 ; 
% Shift operators 

pl= [Nl, 1 :N1-1] ; p2= [N2 , 1 : N2-1 ] ; p3= [N3, 1 : N3-1 ] ; p4= [N4 , 1 : N4-1 ] ; 
Il=speye (Nl) ; I2=speye (N2 ) ; I3=speye (N3 ) ; I4=speye (N4 ) ; 
Tl=Il(:,pl); T2=l2(:,p2); T3=l3(:,p3); T4=I4(:,p4); 
El=spkron (14, spkron (13, spkron (12, spkron (Tl, speye (3) ) ) ) ) 
E2=spkron (14, spkron (13, spkron (T2, spkron (II, speye (3) ) ) ) ) 
E3=spkron (14, spkron (T3, spkron (12, spkron (II, speye (3) ) ) ) ) 
E4=spkron (T4, spkron (13, spkron (12, spkron (II, speye (3) ) ) ) ) 
% Gauge Field configuration {ul, u2, u3, u4 } : 9*N by 2 
I_N=speye (N) ; 

[I, J] =sp find (spkron (I_N, ones (3) ) ) ; 
Ul=sparse ( I , J, ul ( : , 1 ) +i*ul ( : , 2 ) , 3*N, 3*n) ; 
U2=sparse (I, J, u2 ( : , 1) +i*u2 ( : , 2) , 3*N, 3*N) ; 
U3=sparse (I, J, u3 ( : , 1) +i*u3 ( : , 2 ) , 3*N, 3*N) ; 
U4=sparse (I, J, u4 ( : , 1) +i*u4 ( : , 2 ) , 3*N, 3*N) ; 
% Upper triangular 

U=spkron (Pl_minus, U1*E1 ) + spkron (P2_minus, U2*E2 ) + spkron (P3_minus, U3*E3 ) + spkron (P4_minus, U4*E4 ) ; 
% Lower triangular 

L=spkron (Pl_plus , U1*E1) +spkron (P2_plus , U2*E2 ) +spkron (P3_plus , U3*E3) +spkron (P4_plus ,U4*E4); 
%M=U+L' ; 

A= (mass+4) *speye (12*N) -0 . 5* (U+L' ) ; 

% Copyright (C) 2006-2007 Artan Borici. 

% This program is a free software licensed under the terms of the GNU General Public License 



matrices 



Eigenvalue solvers 

The 1.1 version comes with two eigenvalue solvers: PowerMethod, InversePower, which are implementations of the 
methods with the same name. They can be used for the Hermitian eigenvalue problems. For example, if one would like to 
compute the smallest eigenvalue of the Hermitian Wilson operator one can use the InversePower function: 

function [v, lambda, rr] =InversePower (b, xO, tol, nmax) ; 

% Inverse power method for the Hermitian Wilson operator 

v=b/norm(b); rr=[]; 

while 1, 

u=bicg5 (v, xO, le-6, 1000) ; 

u=mult_gamma5 (u) ; 

lambda=v' *u; 

r=v-u/ lambda; 

rnorm=norm (r ) ; rr= [ rr ; rnorm] ; 
if rnorm<tol, break, end 
v=u/norm (u) ; 

end 

% Copyright (C) 2006-2007 Artan Borici. 

% This program is a free software licensed under the terms of the GNU General Public License 
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Domain Wall Fermion operator 



The Mult_DWF implements the Domain Wall Fermion operator 



-mP- \ 



M = 



l-Dw 



\ -mP+ 



P- t-DwJ 



applied to a vector: 

function y=Mult_DWF (x,N5) ; 

% Multiplies a vector by the Domain Wall Fermion matrix 
global N mass_dwf 

x=re shape (x, 12*N,N5) ; 



Y ( : , 1) =x ( : , 1) -Mult_Dirac_W (x ( : , 1) ) +P5plus (x ( : , 2) ) -mass_ciwf *P5minus (x { : ,N5) ) ; 
for j5=2:N5-l; 

Y ( : , j5) =x ( : , j5) -Mult_Dirac_W (x ( : , j5) ) +P5plus (x ( : , j5+l) ) +P5minus (x ( : , j5-l) ) ; 
end 

Y ( : ,N5) =x ( : ,N5) -Mult_Dirac_W (x ( : ,N5) ) -mass_dwf *P5plus (x ( : , 1) ) +P5minus (x ( : ,N5-1) ) ; 

x=reshape (x, 12*N*N5, 1) ; 
Y=reshape (y, 12*N*N5, 1) ; 



% CopYright (C) 2006-2007 Artan Borici. 

% This program is a free software licensed under the terms of the GNU General Public License 
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